1. INTRODUCTION


It is by now well known that if $X_{p \times 1} \sim N(\theta, \sigma^2 I)$,
$p \ge 3$, $\sigma^2$ known, then $X$ is an inadmissible estimator of
$\theta$ under arbitrary positive definite quadratic loss.

If $\sigma^2$ is unknown but an independent estimate of $\sigma^2$
is available, this conclusion still holds. An identity
due to Stein (1973) has been instrumental in
this development. A by-product of this identity is
the fact that, under a simple integrability condition,
a unique unbiased estimator of the risk of estimators
of the form $X + f(X)$, $f_{p \times 1}$, can be provided. Efron
and Morris (1976) discuss this issue in detail both
with $\sigma^2$ known and unknown. They note that these risk
estimators may be employed to select a best estimator
from a class of estimators.

We address the following practical issue. If we
select an estimator of $\theta$ known to improve upon $X$, how
should we estimate this improvement? The above unbiased
estimator need not be admissible under squared error
loss; several attractive alternatives will be proposed.

2. ALTERNATIVE ESTIMATORS

Suppose $X \sim N(\theta, I)$. Let $Z = X^T X$ and consider the
crude James-Stein estimator (pulled toward $\theta = 0$ w.l.o.g.)
$(1 - (p-2)/Z)X$. Under loss $L(\theta, a) = (\theta - a)^T(\theta - a)$, this estimator
uniformly improves upon $X$ with improvement

$$I = g(\lambda) \equiv (p-2)^2 E_\theta(Z^{-1})$$

where $\lambda \equiv \theta^T\theta/2$. Hence $\hat{I}_1 = (p-2)^2 Z^{-1}$ is immediately
the unique unbiased estimator of $g(\lambda)$. Obviously $g(\lambda)$ decreases in
$\lambda$ from $g(0) = p-2$ (worthwhile improvement occurs when
$\lambda$ is small). Hence


$$\hat{I}_2 = \begin{cases} (p-2)^2 Z^{-1}, & Z \ge p-2 \\ p-2, & Z < p-2 \end{cases}$$


dominates $\hat{I}_1$. In fact, it is clear that $\hat{I}_1$ can be
significantly improved upon when $\lambda$ is small since in
this case $\mathrm{var}(\hat{I}_1) \approx 2(p-4)^{-1}(p-2)^2$ (the value at $\lambda = 0$),
which is more than twice $I$ regardless of $p$. Alternatives
to $\hat{I}_2$ will now be developed using the fact that $Z$ has a
noncentral chi-square distribution, i.e. $Z \sim \chi^2_p(\lambda)$. $g(\lambda)$
is then a function of the noncentrality parameter with
the explicit form

$$g(\lambda) = (p-2)^2 \sum_{\ell=0}^{\infty} e^{-\lambda} \lambda^{\ell} (p-2+2\ell)^{-1} / \ell! \tag{2.1}$$
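As a check on the constants used so far, the value $g(0) = p-2$ and the variance quoted for $\hat{I}_1$ at $\lambda = 0$ both follow from the inverse moments of a central chi-square variable. The sketch below assumes $p \ge 5$ so that $E(Z^{-2})$ is finite, a condition the text does not state explicitly.

```latex
\begin{align*}
E(Z^{-1}) &= \frac{1}{p-2}, \qquad
E(Z^{-2}) = \frac{1}{(p-2)(p-4)} \qquad (Z \sim \chi^2_p), \\
g(0) &= (p-2)^2 E(Z^{-1}) = p-2, \\
\mathrm{var}(\hat{I}_1) &= (p-2)^4 E(Z^{-2}) - g(0)^2
  = \frac{(p-2)^3}{p-4} - (p-2)^2
  = \frac{2(p-2)^2}{p-4}.
\end{align*}
```

Since $(p-2)/(p-4) > 1$, the last expression exceeds $2(p-2) = 2g(0)$ for every such $p$.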


The UMVUE for $\lambda$ is $(Z-p)/2$, suggesting the estimator

$$\hat{I}_3 = \begin{cases} g((Z-p)/2), & Z \ge p \\ p-2, & Z < p \end{cases}$$

The MLE for $\lambda$ has been discussed by Meyer (1967). With
a single multivariate normal distribution, it is clearly
$Z/2$, whence $g(Z/2)$ is the MLE for $g(\lambda)$. $g(Z/2)$ is always
less than [symbol illegible in source] and when $\lambda$ is small, regardless of $p$, this
underestimation badly inflates mean square error relative
to that of $\hat{I}$ [subscript illegible in source].


Improvements under squared error loss to the UMVUE for
$\lambda$ are discussed in Perlman and Rasmussen (1975) and in
Neff and Strawderman (1976). One such estimator is
$\lambda^* = [(Z-p)/2 + (p-4)/Z]^+$, $p \ge 5$, suggesting the estimator
$\hat{I}_4 = g(\lambda^*)$. However, $\hat{I}_4 \le \hat{I}_3$ and simulation (to be
described shortly) shows that for $\lambda$ small the $(p-4)/Z$
term drastically inflates the bias and variance of $\hat{I}_4$
relative to $\hat{I}_3$.

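A more direct approach is to investigate Bayes
estimators for $g(\lambda)$ under squared error loss. Suppose
$\theta \sim N(0, \gamma I)$, i.e. $\lambda \sim (\gamma/2)\chi^2_p$, whence $\lambda \mid Z \sim (a/2)\chi^2_p(aZ/2)$
where $a \equiv \gamma/(\gamma+1)$. We seek $E(g(\lambda) \mid Z)$. Using (2.1) we
obtain

$$E(g(\lambda) \mid Z) = (p-2)^2 \sum_{\ell=0}^{\infty} \frac{(p-2+2\ell)^{-1}}{\ell!} \int \lambda^{\ell} e^{-\lambda} f(\lambda \mid Z)\, d\lambda .$$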


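Letting $s = (p-2)/2$ and performing the integration, we
obtain

$$E(g(\lambda) \mid Z) = \frac{(p-2)^2}{2} \sum_{j=0}^{\infty} \frac{e^{-aZ/2}(aZ/2)^j}{j!} \sum_{\ell=0}^{\infty} (s+\ell)^{-1} \binom{s+j+\ell}{\ell} \left(\frac{a}{a+1}\right)^{\ell} \left(\frac{1}{a+1}\right)^{s+j+1} . \tag{2.2}$$

We may interpret (2.2) with $J \sim \text{Poisson}(aZ/2)$, $L \mid J \sim$ Negative
Binomial$\left(\frac{1}{a+1},\, s+J+1\right)$ (where $s$ need not be an integer) and

$$E(g(\lambda) \mid Z) = \frac{(p-2)^2}{2}\, E\big(E((s+L)^{-1} \mid J)\big) . \tag{2.3}$$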


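The identity

$$\sum_{\ell=0}^{\infty} (s+\ell)^{-1} \binom{s+j+\ell}{\ell} \left(\frac{a}{a+1}\right)^{\ell+s} = \sum_{k=0}^{j} \binom{j}{k} (k+s)^{-1} a^{k+s}$$

(derivable by considering the indefinite integral with
respect to $a$ of $a^{s-1}(a+1)^j$ directly or through its
equivalent negative binomial expansion) leads to

$$E(g(\lambda) \mid Z) = \frac{(p-2)^2}{2(a+1)} \sum_{j=0}^{\infty} \sum_{k=0}^{j} \frac{e^{-aZ/2}}{j!} \left(\frac{aZ}{2(a+1)}\right)^{j} \binom{j}{k} (k+s)^{-1} a^{k} .$$

Interchanging the order of summation and summing over $j$, we
obtain the Bayes rule in simple form:

$$E(g(\lambda) \mid Z) = (a+1)^{-1}\, g\!\left(\frac{a^2 Z}{2(a+1)}\right) . \tag{2.4}$$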
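The closed form (2.4) can be checked numerically against the Poisson and negative binomial mixture in (2.2) and (2.3). The following is a minimal Python sketch of that check, not the computation used in the paper; the function names, the truncation points (60 outer and 400 inner terms) and the test values of $Z$, $a$ and $p$ are all assumptions chosen for illustration.

```python
import math

def g(lam, p, terms=400):
    """Series (2.1): (p-2)^2 * sum_l e^{-lam} lam^l / (l! (p-2+2l))."""
    total, pois = 0.0, math.exp(-lam)      # pois = Poisson(lam) pmf at l
    for l in range(terms):
        total += pois / (p - 2 + 2 * l)
        pois *= lam / (l + 1)
    return (p - 2) ** 2 * total

def bayes_double_sum(z, a, p, j_terms=60, l_terms=400):
    """Truncated version of (2.2): J ~ Poisson(aZ/2), L | J negative binomial."""
    s = (p - 2) / 2
    q = a / (a + 1)
    m = a * z / 2
    outer, pj = 0.0, math.exp(-m)          # pj = Poisson(m) pmf at j
    for j in range(j_terms):
        nb = (1 - q) ** (s + j + 1)        # negative binomial pmf at l = 0
        inner = 0.0
        for l in range(l_terms):
            inner += nb / (s + l)
            nb *= q * (s + j + l + 1) / (l + 1)   # pmf ratio from l to l + 1
        outer += pj * inner
        pj *= m / (j + 1)
    return (p - 2) ** 2 / 2 * outer

def bayes_closed_form(z, a, p):
    """Right-hand side of (2.4)."""
    return g(a * a * z / (2 * (a + 1)), p) / (a + 1)

if __name__ == "__main__":
    z, a, p = 6.0, 0.5, 5
    print(bayes_double_sum(z, a, p), bayes_closed_form(z, a, p))
```

The truncation points are adequate only for moderate $Z$; larger values of $aZ/2$ would need more outer terms.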

Letting $\gamma \to \infty$, i.e. $a \to 1$, in (2.4) leads to $\hat{I}_5 = \tfrac{1}{2} g(Z/4)$.
As noted in Perlman and Rasmussen, this resultant
"noninformative" prior on $\theta$ yields a prior on $\lambda$ which is
nonuniform and, in fact, is biased against small $\lambda$. The
fact that $\hat{I}_5$ is at most $(p-2)/2$ reflects this. A more
appealing "empirical" Bayes estimator is obtained by estimating
$(\gamma+1)^{-1}$ in (2.4) by $pZ^{-1}$ (as in Perlman and Rasmussen),
resulting in

$$\hat{I}_6 = \begin{cases} \dfrac{Z}{2Z-p}\; g\!\left(\dfrac{(Z-p)^2}{2(2Z-p)}\right), & Z \ge p \\[2ex] p-2, & Z < p \end{cases}$$
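The two Bayes-type estimators are simple to evaluate once $g$ is available. The sketch below, in Python, obtains $\hat{I}_6$ by substituting $a = 1 - p/Z$ into (2.4), which is algebraically the same as the case formula above. The function names and the fixed 400-term truncation of (2.1) are illustrative assumptions, suitable only for moderate $Z$.

```python
import math

def g(lam, p, terms=400):
    """Series (2.1): (p-2)^2 * sum_l e^{-lam} lam^l / (l! (p-2+2l))."""
    total, pois = 0.0, math.exp(-lam)      # pois = Poisson(lam) pmf at l
    for l in range(terms):
        total += pois / (p - 2 + 2 * l)
        pois *= lam / (l + 1)
    return (p - 2) ** 2 * total

def i5(z, p):
    """Limit of (2.4) as the prior variance grows (a -> 1)."""
    return 0.5 * g(z / 4, p)

def i6(z, p):
    """Empirical Bayes version: plug a = 1 - p/Z into (2.4), truncated at Z < p."""
    if z < p:
        return p - 2.0
    a = 1.0 - p / z
    return g(a * a * z / (2 * (a + 1)), p) / (a + 1)

if __name__ == "__main__":
    for z in (3.0, 5.0, 12.0, 30.0):
        print(z, i5(z, 5), i6(z, 5))
```

At $Z = p$ the plug-in value is $a = 0$, so $\hat{I}_6 = g(0) = p-2$ and the two branches join continuously.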

From (2.1), $g(\lambda) = (p-2)^2 E(p-2+2L)^{-1}$ where $L \sim \text{Poisson}(\lambda)$.
Consider the following conditional estimation problem.
Treating $L$ as a parameter, estimate $\gamma(L) = (p-2)^2 (p-2+2L)^{-1}$.
This idea is also suggested by (2.3) and such an approach
was discussed by Stein (1964) in conjunction with the
estimation of the variance of a normal distribution with
unknown mean. Given $L$, $Z \sim \chi^2_{p+2L}$ and we seek estimators
based on $Z$ of $\gamma(L)$. In this setting $\hat{I}_1$ is again the
UMVUE and again the truncation in $\hat{I}_2$ is appropriate.
Since $E(Z-2 \mid L) = p-2+2L$, the estimator

$$\hat{I}_7 = \begin{cases} (p-2)^2 (Z-2)^{-1}, & Z \ge p \\ p-2, & Z < p \end{cases}$$

may be considered. $\hat{I}_7 \ge \hat{I}_2$. For small $\lambda$, simulation reveals
$\hat{I}_7$ to be less biased with smaller variance than $\hat{I}_2$. For
large $\lambda$, $Z$ will likely be greater than $p$, whence
$\hat{I}_7 \approx (p-2)^2 (Z-2)^{-1}$, $\hat{I}_2 \approx \hat{I}_1$. Now $\hat{I}_2$ is nearly unbiased
and simulation reveals that $\hat{I}_2$ will have smaller variance
than $\hat{I}_7$.
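The pointwise ordering of $\hat{I}_1$, $\hat{I}_2$ and $\hat{I}_7$, and the kind of simulation comparison referred to above, can be sketched in a few lines of Python. This is not the paper's simulation code: the function names, the seed, and the choice to place all of $\theta$ in one coordinate (which leaves the distribution of $Z$ unchanged for a given $\lambda$) are assumptions made for illustration.

```python
import random

def i1(z, p):
    """Unbiased estimator (p-2)^2 / Z."""
    return (p - 2) ** 2 / z

def i2(z, p):
    """I1 truncated at its upper bound p - 2."""
    return (p - 2) ** 2 / z if z >= p - 2 else p - 2.0

def i7(z, p):
    """(p-2)^2 / (Z-2), truncated at p - 2 when Z < p."""
    return (p - 2) ** 2 / (z - 2) if z >= p else p - 2.0

def simulate(p, lam, reps=7000, seed=0):
    """Return {name: (sample mean, sample variance)} over `reps` draws of Z."""
    rng = random.Random(seed)
    theta1 = (2 * lam) ** 0.5          # theta = (theta1, 0, ..., 0), theta'theta/2 = lam
    draws = {"I1": [], "I2": [], "I7": []}
    for _ in range(reps):
        z = rng.gauss(theta1, 1.0) ** 2
        z += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p - 1))
        draws["I1"].append(i1(z, p))
        draws["I2"].append(i2(z, p))
        draws["I7"].append(i7(z, p))
    out = {}
    for name, xs in draws.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
        out[name] = (mean, var)
    return out

if __name__ == "__main__":
    for lam in (0.0, 2.0, 10.0):
        print(lam, simulate(10, lam))
```

Comparing the sample means with $g(\lambda)$ gives the bias of each estimator; since $\hat{I}_2 = \min(\hat{I}_1, p-2)$, its sample variance can never exceed that of $\hat{I}_1$ on the same draws.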

The MLE of $L$ is not easily obtained, hence for $\gamma$ as
well. Under squared error loss, $(\gamma(L) - a)^2$, shrinkage of
$\hat{I}_1$ is suggested, i.e. amongst estimators of the form $c\hat{I}_1$
the optimal $c$ is $(p-2+2L)^{-1}(p-4+2L)$. Consider Bayes estimates
of $\gamma(L)$ under this loss. Denoting the prior on $L$
by $\pi$, the Bayes rule vs. $\pi$, $\delta_\pi(Z)$, becomes

$$\delta_\pi(Z) = \frac{(p-2)^2 \sum_{\ell=0}^{\infty} (p-2+2\ell)^{-1} (Z/2)^{\ell}\, \pi(\ell) / \Gamma\!\left(\frac{p+2\ell}{2}\right)}{\sum_{\ell=0}^{\infty} (Z/2)^{\ell}\, \pi(\ell) / \Gamma\!\left(\frac{p+2\ell}{2}\right)}$$


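If we define

$$h_\pi(Z) = \sum_{\ell=0}^{\infty} (p-2+2\ell)^{-1} (Z/2)^{(p-2+2\ell)/2}\, \pi(\ell) / \Gamma\!\left(\frac{p+2\ell}{2}\right) \tag{2.5}$$

then straightforwardly

$$\delta_\pi(Z) = \frac{(p-2)^2\, h_\pi(Z)}{2Z\, h'_\pi(Z)} . \tag{2.6}$$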
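The step from (2.5) to (2.6) is a term-by-term differentiation. In the sketch below, $N_\pi(Z)$ and $S_\pi(Z)$ are labels introduced here only for convenience; they denote the numerator sum and the denominator sum of the Bayes rule $\delta_\pi(Z)$.

```latex
\begin{align*}
h_\pi(Z) &= \left(\frac{Z}{2}\right)^{(p-2)/2} N_\pi(Z), \qquad
N_\pi(Z) = \sum_{\ell=0}^{\infty} (p-2+2\ell)^{-1} \left(\frac{Z}{2}\right)^{\ell}
           \frac{\pi(\ell)}{\Gamma\!\left(\frac{p+2\ell}{2}\right)}, \\
h'_\pi(Z) &= \sum_{\ell=0}^{\infty} \frac{1}{4} \left(\frac{Z}{2}\right)^{(p-4+2\ell)/2}
           \frac{\pi(\ell)}{\Gamma\!\left(\frac{p+2\ell}{2}\right)}
         = \frac{1}{4} \left(\frac{Z}{2}\right)^{(p-4)/2} S_\pi(Z), \\
\frac{(p-2)^2\, h_\pi(Z)}{2Z\, h'_\pi(Z)}
  &= (p-2)^2\, \frac{(Z/2)\, N_\pi(Z)}{(Z/2)\, S_\pi(Z)}
   = (p-2)^2\, \frac{N_\pi(Z)}{S_\pi(Z)} = \delta_\pi(Z).
\end{align*}
```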


Clearly, $\hat{I}_1$ can't be Bayes vs. any prior, i.e. we would
need $h_\pi(Z) = e^{Z/2}$, impossible by equating coefficients in
(2.5). A shrinkage estimator arises if $h'_\pi / h_\pi > 1/2$.
Two priors yielding simple expressions for $\delta_\pi(Z)$ are:
(i) $\pi(\ell) = (\tfrac{p}{2} - 1 + \ell)^2\, \Gamma(\tfrac{p}{2} - 1 + \ell)/\ell!$ (mass on large $\ell$), resulting
in $\hat{I}_8 = (p-2)^2 (p-2+Z)^{-1}$. $\hat{I}_8$ underestimates $g(\lambda)$ and
when $\lambda$ is small simulation reveals this to critically
inflate mean squared error. (ii) $\pi(0) = (p-2)/p$, $\pi(1) = 2/p$
(mass on small $\ell$), resulting in
$\hat{I}_9 = (p-2)^2 (p^2(p-2) + 2pZ)^{-1} (p^2 + 2Z)$. With increasing $\lambda$
the bias in $\hat{I}_9$ tends to $(p-2)^2/p$, again critically inflating
mean squared error. A uniform prior on $L$ does not
yield an estimator in closed form. However, for small $\lambda$,
this estimator tends to shrink $\hat{I}_1$; for large $\lambda$, it tends
to expand $\hat{I}_1$.
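The Bayes rule $\delta_\pi(Z)$ is easy to evaluate for any prior by truncating its two sums, which also gives a numerical check on the closed forms for priors (i) and (ii). The Python sketch below is illustrative only: the function names, the log-scale weights and the 200-term truncation are assumptions. Prior (i) is coded as $(\tfrac{p}{2}-1+\ell)^2\,\Gamma(\tfrac{p}{2}-1+\ell)/\ell!$, the form that reproduces $(p-2)^2(p-2+Z)^{-1}$.

```python
import math

def bayes_rule(z, p, log_prior, terms=200):
    """Posterior mean of gamma(L) = (p-2)^2/(p-2+2L) when Z | L ~ chi^2_{p+2L}."""
    num = den = 0.0
    for l in range(terms):
        lp = log_prior(l)
        if lp == -math.inf:
            continue                      # prior puts no mass on this l
        # prior times likelihood of Z given L = l, dropping factors free of l
        w = math.exp(lp + l * math.log(z / 2) - math.lgamma(p / 2 + l))
        num += w / (p - 2 + 2 * l)
        den += w
    return (p - 2) ** 2 * num / den

def log_prior_i(p):
    """Improper prior (i), on the log scale."""
    s = p / 2 - 1
    return lambda l: 2 * math.log(s + l) + math.lgamma(s + l) - math.lgamma(l + 1)

def log_prior_ii(p):
    """Two-point prior (ii): pi(0) = (p-2)/p, pi(1) = 2/p."""
    probs = {0: (p - 2) / p, 1: 2 / p}
    return lambda l: math.log(probs[l]) if l in probs else -math.inf

def i8(z, p):
    return (p - 2) ** 2 / (p - 2 + z)

def i9(z, p):
    return (p - 2) ** 2 * (p * p + 2 * z) / (p * p * (p - 2) + 2 * p * z)

if __name__ == "__main__":
    z, p = 10.0, 8
    print(bayes_rule(z, p, log_prior_i(p)), i8(z, p))
    print(bayes_rule(z, p, log_prior_ii(p)), i9(z, p))
```

Passing a different `log_prior`, for instance a constant for the uniform prior on $L$, gives the corresponding rule numerically where no closed form is available.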

A simulation based on 7000 replications at each $\lambda$
and $p$ was developed to examine the $\hat{I}_j$, $j = 1, \ldots, 9$.
Estimators involving $g$ are most easily computed from
(2.1) in recursive fashion with double precision, i.e.
by writing $g(\cdot) = (p-2)^2 \exp(-\,\cdot\,) \sum_{\ell=0}^{\infty} b_\ell$ where
$b_{\ell+1} = b_\ell \cdot \dfrac{p-2+2\ell}{p+2\ell} \cdot \dfrac{(\cdot)}{\ell+1}$ and $b_0 = (p-2)^{-1}$. Table 1 presents
an abbreviated summary for the best performers, $\hat{I}_2$,
$\hat{I}$ [subscript illegible in source], and $\hat{I}_7$.
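The recursion for $g$ can be written in a few lines. The following Python sketch follows the $b_\ell$ recursion stated above; the stopping rule (relative tolerance `tol`) and the cap `max_terms` are assumptions added here, since the text does not say how the series was truncated.

```python
import math

def g(lam, p, tol=1e-16, max_terms=10_000):
    """Evaluate (2.1) as (p-2)^2 * exp(-lam) * sum_l b_l via the b_l recursion."""
    b = 1.0 / (p - 2)                     # b_0 = (p-2)^{-1}
    total = b
    for l in range(max_terms):
        # b_{l+1} = b_l * (p-2+2l)/(p+2l) * lam/(l+1)
        b *= (p - 2 + 2 * l) / (p + 2 * l) * lam / (l + 1)
        total += b
        if b < tol * total:               # remaining terms are negligible
            break
    # Plain floats are adequate for moderate lam; exp(-lam) underflows for very large lam.
    return (p - 2) ** 2 * math.exp(-lam) * total

if __name__ == "__main__":
    for lam in (0.0, 0.5, 2.0, 10.0):
        print(lam, g(lam, 7))
```

Each term costs one multiplication, and no factorial or power is ever formed explicitly, which is what makes the recursive form convenient.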


3. EXTENSIONS

Extension to the case where $X \sim N(\theta, \sigma^2 I)$, $\sigma^2$ unknown,
is immediate if an independent estimator $\hat\sigma^2$ of $\sigma^2$ is
available based on a chi-square random variable with $r$
degrees of freedom. The simple James-Stein estimator
becomes $(1 - (p-2) r \hat\sigma^2 / (r+2) Z) X$, which under squared error
loss uniformly improves upon $X$ with relative improvement

$$\frac{(p-2)^2}{p} \cdot \frac{r}{r+2}\, \sigma^2 E Z^{-1} \equiv g(\lambda) \tag{3.1}$$

where $\lambda = \theta^T\theta / 2\sigma^2$. The independence of $\hat\sigma^2$ and $Z$ enables
straightforward development of estimators of $g(\lambda)$ paralleling
$\hat{I}_1 - \hat{I}_9$. The resulting estimators will depend on $\hat\sigma^2$
and $Z$ only through $U = Z/\hat\sigma^2$. Details are omitted. In
the context of estimation in a full rank linear model,
let $Y = X\beta + \varepsilon$, $X_{n \times p}$, $r(X) = p \ge 3$, $\varepsilon \sim N(0, \sigma^2 I)$ and $\hat\beta_{OLS}$
be the ordinary least squares estimate of $\beta$. Then
$T(\hat\beta_{OLS}) = (1-A)\hat\beta_{OLS} + A\beta^*$, where $A = c\hat\sigma^2/Q$, $\hat\sigma^2$ is the
UMVUE of $\sigma^2$, $Q = (\hat\beta_{OLS} - \beta^*)^T X^T X (\hat\beta_{OLS} - \beta^*)$,
$c = (p-2)(n-p)/(n-p+2)$ and $\beta^*$ is a fixed vector, uniformly
improves upon $\hat\beta_{OLS}$ under loss proportional to
$(\hat\beta - \beta)^T X^T X (\hat\beta - \beta)$ with relative improvement $c\sigma^2 E(Q^{-1})$
analogous to (3.1). Estimators of this relative improvement
will be expressible as functions of the "F-statistic",
$(n-p)U/p$, or of $R^2(\beta^*)$, the sample multiple correlation
coefficient resulting from fitting the adjusted regression
model $Y - X\beta^* = X\beta + \varepsilon$.

In extending these ideas to minimax estimators of
$\theta$, other than the above James-Stein estimator, orthogonally
invariant estimators of the form
$(1 - (p-2) r \tau(U) / (r+2) U) X$ have been shown to uniformly
improve upon $X$ under squared error loss if, for example,
$0 \le \tau(\cdot) \le 2$ and $\tau(\cdot)$ nondecreasing (Baranchik (1970)).
The relative improvement of such estimators is shown to
be (Efron and Morris)


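$$\frac{p-2}{p} \cdot \frac{r}{r+2}\, E\left\{ (p-2)\, \frac{\tau(U)\,(2 - \tau(U))}{U} + 4\tau'(U) \left(1 + \frac{p-2}{r+2}\, \tau(U)\right) \right\} \equiv g(\lambda)$$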

where $\lambda$ again is $\theta^T\theta / 2\sigma^2$. Without specification of $\tau$
it is unclear as to whether the implicit unbiased
estimator of $g(\lambda)$ is admissible. Nonetheless, ideas
of the previous section may be used to suggest alternative
estimators.